Text File | 1986-01-11 | 1KB | 40 lines
FALSE HITS
INDEX maps words into numbers by applying a
mathematical operation to the letters in each
word. The resulting number is divided by 4093
and the remainder is used as a code to signify
that a file contains a particular word.
Different words can generate the same code.
LOCATE applies the same algorithm to keywords
and retrieves files on the basis of the code.
Consequently, LOCATE may (and does) return files
that do not contain the desired keyword. We
call these "false hits".
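
The scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say which mathematical operation INDEX applies, so the sum of character codes used in word_code is a hypothetical stand-in, as are the function names, the case folding, and the sample file names. Only the divisor 4093 comes from the text.

```python
import re

MODULUS = 4093  # the divisor stated in the text


def word_code(word):
    # Hypothetical stand-in for the unspecified "mathematical operation":
    # add up the character codes of the uppercased word, keep the remainder.
    return sum(ord(ch) for ch in word.upper()) % MODULUS


def index_files(files):
    """Map each file name to the set of codes of the words in its text."""
    return {
        name: {word_code(w) for w in re.findall(r"[A-Za-z]+", text)}
        for name, text in files.items()
    }


def locate(index, keywords):
    """Return every file whose code set holds the code of each keyword."""
    wanted = {word_code(k) for k in keywords}
    return sorted(name for name, codes in index.items() if wanted <= codes)


# Made-up sample data: "listen" and "silent" use the same letters, so
# under this stand-in operation they receive the same code.
files = {
    "A.TXT": "listen to the radio",
    "B.TXT": "a silent night",
}
index = index_files(files)

one_keyword = locate(index, ["silent"])            # A.TXT is a false hit
two_keywords = locate(index, ["silent", "night"])  # only B.TXT remains
```

With this sample data, a search for "silent" returns both files, although only B.TXT contains the word. Adding "night" as a second keyword drops the false hit, and the file that really contains the keywords is never lost.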
That appears to be a problem, and it is indeed
a matter to be dealt with. Several points
mitigate it.
First, the file list returned does contain
every file in which the keywords appear.
Second, the number of false hits decreases
as the number of keywords increases. The
probability of a false hit approaches zero
quickly.
Third, LOCATE is very fast. It is a simple
matter to re-search the index with an additional
keyword.
Fourth, the desired file is often easily
spotted from the list returned.
Fifth, many programs exist to scan files
for a particular keyword, for example, the
MS-DOS utility "find".
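
The second point above can be made concrete with a rough estimate. The sketch below is not taken from INDEX or LOCATE. It assumes that codes are spread evenly over the 4093 possible values, that the codes of different keywords are independent, and that the file in question contains none of the keywords. The function name and the figure of 200 distinct words per file are hypothetical.

```python
MODULUS = 4093  # the divisor stated in the text


def false_hit_probability(distinct_words, keywords):
    """Estimate the chance that a file containing none of the keywords
    still matches the codes of all of them.

    Assumes evenly spread, independent codes (a simplification).
    """
    # Chance that at least one of the file's words shares one given code.
    single = 1.0 - (1.0 - 1.0 / MODULUS) ** distinct_words
    # Every keyword code must be matched by coincidence.
    return single ** keywords


# Hypothetical file with 200 distinct words.
for k in (1, 2, 3):
    print(k, false_hit_probability(200, k))
```

Under these assumptions a single keyword gives a false hit on such a file roughly one time in twenty, and each added keyword multiplies that chance by the same small factor.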
We expect to supply a post-LOCATE delivery
program to address the false hit issue in the
second quarter of 1987.
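
Until then, false hits can be removed by scanning the returned files for the keywords themselves, in the spirit of the fifth point above. The following is a minimal sketch of such a scan, not the planned program. The function names are hypothetical, and the whole-word, case-insensitive matching is an assumption about what counts as containing a keyword.

```python
import re


def contains_word(text, keyword):
    """True if keyword occurs in text as a whole word, ignoring case."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def confirm_hits(candidates, keywords):
    """Keep only the candidate files that really contain every keyword.

    candidates maps a file name to the text of that file.
    """
    return sorted(
        name
        for name, text in candidates.items()
        if all(contains_word(text, k) for k in keywords)
    )


# Made-up list of files, as a search on "silent" might return it.
returned = {
    "A.TXT": "listen to the radio",
    "B.TXT": "a silent night",
}
confirmed = confirm_hits(returned, ["silent"])
```

Here only B.TXT survives the scan. Because the scan reads just the files that were returned, rather than every indexed file, it adds little to the total search time.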